Two Multivariate Generalizations of Pointwise Mutual Information
Author
Abstract
Since its introduction into the NLP community, pointwise mutual information has proven to be a useful association measure in numerous natural language processing applications, such as collocation extraction and word space models. In its original form, it is restricted to the analysis of two-way co-occurrences. NLP problems, however, need not be restricted to two-way co-occurrences; often, a particular problem can be tackled more naturally when formulated as a multi-way problem. In this paper, we explore two multivariate generalizations of pointwise mutual information, and examine their nature and usefulness in the extraction of subject-verb-object triples.
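As an illustrative sketch (not necessarily the paper's exact definitions): two-way PMI compares a joint probability with the product of the marginals, and one natural three-way extension compares the joint to the product of all three marginals, while an interaction-style variant also involves the pairwise joints. The function names below are hypothetical, and sign conventions for the interaction form vary in the literature.

```python
import math

def pmi(p_xy, p_x, p_y):
    # Standard two-way pointwise mutual information:
    # log2 of joint probability over product of marginals.
    return math.log2(p_xy / (p_x * p_y))

def pmi_total(p_xyz, p_x, p_y, p_z):
    # One multivariate extension (pointwise total-correlation style):
    # joint probability over the product of all single marginals.
    # Assumption: illustrative form only.
    return math.log2(p_xyz / (p_x * p_y * p_z))

def pmi_interaction(p_xyz, p_xy, p_xz, p_yz, p_x, p_y, p_z):
    # An interaction-information-style form, mixing pairwise joints
    # and single marginals. Sign conventions differ across sources.
    return math.log2((p_xy * p_xz * p_yz) / (p_x * p_y * p_z * p_xyz))
```

Under full independence all three measures are zero, which is the usual sanity check for an association measure.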
Similar works
Probability Mass Exclusions and the Directed Components of Pointwise Mutual Information
The pointwise mutual information quantifies the mutual information between events x and y from random variables X and Y. This article considers the pointwise mutual information in a directed sense, examining precisely how an event y provides information about x via probability mass exclusions. Two distinct types of exclusions are identified, namely informative and misinformative exclusions. Then...
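The directed reading above can be sketched as a difference of surprisals: observing y changes the probability of x, and PMI measures how much that observation reduces (or increases) our surprise about x. This is a generic identity for PMI, not this article's decomposition; the function names are illustrative.

```python
import math

def surprisal(p):
    # Surprisal (self-information) of an event with probability p, in bits.
    return -math.log2(p)

def pmi_toward_x(p_x, p_x_given_y):
    # Directed reading: how much observing y reduces our surprise about x.
    # Equals log2(p(x|y) / p(x)).
    return surprisal(p_x) - surprisal(p_x_given_y)

def pmi_toward_y(p_y, p_y_given_x):
    # The opposite reading: how much observing x reduces surprise about y.
    return surprisal(p_y) - surprisal(p_y_given_x)
```

For any consistent joint distribution the two directed readings agree in value (PMI is symmetric), even though the probability mass exclusions behind them differ, which is what motivates the directed analysis.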
Quantum mutual information and quantumness vectors for multi-qubit systems
Characterization of the correlations of a multiparticle system remains an open problem. We generalize the notion of quantum discord, using multivariate mutual information, to characterize the quantum properties of a multiparticle system. This new measure, called dissension, is a set of numbers: a quantumness vector. There are a number of different possible generalizations. We consider two of the...
Measuring Multivariate Redundant Information with Pointwise Common Change in Surprisal
The problem of how to properly quantify redundant information is an open question that has been the subject of much recent research. Redundant information refers to information about a target variable S that is common to two or more predictor variables X_i. It can be thought of as quantifying overlapping information content or similarities in the representation of S between the X_i. We present ...
The Partial Entropy Decomposition: Decomposing multivariate entropy and mutual information via pointwise common surprisal
Obtaining meaningful quantitative descriptions of the statistical dependence within multivariate systems is a difficult open problem. Recently, the Partial Information Decomposition (PID) was proposed to decompose mutual information (MI) about a target variable into components which are redundant, unique and synergistic within different subsets of predictor variables. Here, we propose to apply ...
Collocation Extraction beyond the Independence Assumption
In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. Expected probabilities are derived from automatically trained Aggregate Markov Models. On three colloc...